Abstract

We study returns in dynamical systems: when a set of points, initially 
populating a prescribed region, swarms around phase space according to 
a deterministic rule of motion, we say that the return of the set occurs 
at the earliest moment when one of these points comes back to the orig- 
inal region. We describe the statistical distribution of these "first-return 
times" in various settings: when phase space is composed of sequences of 
symbols from a finite alphabet (with application for instance to biological 
problems) and when phase space is a one and a two-dimensional man- 
ifold. Specifically, we consider Bernoulli shifts, expanding maps of the 
interval and linear automorphisms of the two-dimensional torus. We derive
relations linking these statistics with Rényi entropies and Lyapunov
exponents.



1 Introduction 

In this paper we investigate a phenomenon of vast relevance in physics: returns. 
Whenever a system evolves according to a deterministic, or even a probabilis- 
tic law, along the course of time it may pass close to points previously visited. 
We then speak of returns and of return times. Of course, this concept can be 
made precise and rigorous: this has been done since the beginning of the theory 
of dynamical systems, where the Poincaré theorem (establishing that in measure
preserving systems returns happen almost surely) is probably the first and certainly
the most celebrated result. Passing via the Kac theorem and coming to recent
years, mathematical investigation has flourished and produced beautiful results 
relating the statistics of return times, i.e. the collective counting of these values, 
to more conventional dynamical indicators, such as generalized dimensions of 
invariant measures, Lyapunov exponents and the like. This paper continues in 
this ongoing investigation, with a special character: rather than presenting a
single, thoroughly investigated result, we attempt to provide a heuristic, global
picture of the dynamical phenomena that are at work. This picture will be con- 
firmed by numerical experiments. Rigor and detailed proofs will be the matter 
for successive publications. 

There exist many different alternatives when defining returns: in this paper,
we make specific reference to what is called the first return of a (measurable)
subset $A$ of a compact metric space $X$, endowed with a measure $\mu$ defined on the
Borel sigma algebra $\mathcal{A}$. Motion on $X$ is effected by the action of a transformation
$T$ that preserves the measure $\mu$. We thereby define the time of first return of $A$
into itself as

$$\tau(A) := \min\{k > 0 \ \text{s.t.}\ T^k A \cap A \neq \emptyset\}. \tag{1}$$

As anticipated in the abstract, this is the time of the earliest return to $A$ of one
of its points:

$$\tau(A) = \min\{k > 0 \ \text{s.t.}\ \exists x \in A;\ T^k(x) \in A\}. \tag{2}$$

Two choices will be made for $A$. Firstly, $A$ will be a ball of radius $\varepsilon$, centered
at a point $x \in X$: $B_\varepsilon(x)$. Secondly, $A$ will be a dynamically generated cylinder.
Let us suppose that $\mathcal{C}$ is a finite partition of $X$ and take the $n$-join, $\mathcal{C}^n :=
\bigvee_{i=0}^{n-1} T^{-i} \mathcal{C}$. We call cylinder of length $n$ around $x \in X$, denoted with $C_n(x)$,
the unique element of $\mathcal{C}^n$ containing $x$. The statistics of return times are then
defined by the collective counting, over different sets, of the values $\tau(A)$. The
distribution of return times, $p(\varepsilon, k)$, is

$$p(\varepsilon, k) := \mu(\{x \in X \ \text{s.t.}\ \tau(B_\varepsilon(x)) = k\}); \tag{3}$$

similarly, $p(n, k)$ is defined replacing $B_\varepsilon(x)$ by $C_n(x)$. They measure the fraction
of points in the space $X$ whose neighborhood (whether a ball of radius $\varepsilon$, or a
cylinder of length $n$) first returns to itself after $k$ iterations of the map. We
shall also consider the cumulative distributions (integrated statistics) $P(\varepsilon, k)$,

$$P(\varepsilon, k) := \sum_{j=1}^{k} p(\varepsilon, j) \tag{4}$$

and $P(n, k)$. Our aim will be to study the behavior of these distributions in
systems that are simple enough to permit both a theoretical analysis and a 
precise numerical simulation. 

The time of first return of sets (1) arises in several circumstances. Since
it controls the shortest return time of points in the set, it plays a crucial role
in establishing the asymptotic (exponential) distribution of the return times of all
points to the set $A$ when the measure of $A$ goes to zero, a different
and much investigated topic [TH [31 [TJ [321 1201 US]. In addition, it has been
used to define the recurrence dimension, serving as the gauge set function
to construct a suitable Carathéodory measure [H [551 1Z]. Finally, it has been
related to the algorithmic information content [11].

Returns of sets are also relevant in applications, like those of biological interest.
In fact, when the space $X$ consists of sequences of symbols from a finite alphabet
(think e.g. of DNA sequencing), particular words, or motifs, have been found to
be related to biological mechanisms like transcription sites or protein interaction
(see for instance [35]). It is then important to quantify the statistical properties
of these words within the genome. In particular, a typical word of length $n$
(that is, a finite sequence of $n$ letters) will recur within a time of the order $e^{nh}$,
for large $n$, where $h$ is the metric entropy of the system, as predicted by the
Ornstein-Weiss theorem [23]. Yet, if we look at the delay of the first recurrence
of the same word, when observed in the whole (possibly infinite) sequence, this
scales as $n$ [2H1 E]. This time of first return is precisely the quantity studied in
this paper.

The plan of the paper is the following: in the next section we review a few 
results that are useful for the understanding of the paper. For this reason we 
neither need nor claim completeness. In Sect. 3 we consider the case of return
times in cylinders for Bernoulli systems. Quite evidently, this is the simplest
setting in which to study return times of sets. We first present a heuristic explanation
of the results rigorously proven in [3] and [3T] that permit to compute,
via return times, the Rényi entropies. Then, we refine this analysis to obtain
new results on the type of convergence of the conventional quantities and we introduce
a new one, for which convergence is much faster. Moreover, our theory
allows us to obtain a description of the different asymptotics of $p(n, k)$ in the
$(n, k)$ plane. In Sect. 4 we leave the symbolic description to enter a geometric
setting by considering expanding maps of the interval. We show how results
proven in the symbolic setting can be adapted to describe the distribution function
$p(\varepsilon, k)$. In particular, we obtain a formula for the asymptotic behavior of
this function when $k$ and $-\log \varepsilon$ grow while keeping a constant ratio, that involves
the Lyapunov exponent and the Rényi entropies. While the treatment of
Sect. 4 is tailored to the specific system under investigation, the following Sect.
5 introduces a more general approach, that confirms the results of the previous
section. Following these ideas, in Sect. 6 we consider the case of linear automorphisms
of the two-dimensional torus, that can be investigated completely.
We also conjecture a general form for the constant $k$ over $-\log \varepsilon$ asymptotics
described above. In the Conclusions we review the new results presented in this
paper.



2 Review of known results and definitions 

Many facts concerning return times of sets are known. In this section we review 
a few of these results that are relevant for the understanding of this paper. If the
dynamical system $(X, \mu, T)$ has positive metric entropy, $h_\mu$, it has been proven
[21] [6] that:

$$\liminf_{n \to \infty} \frac{\tau(C_n(x))}{n} \geq 1. \tag{5}$$

The limit of the quantity above exists and is equal to one $\mu$-almost everywhere
in certain cases, including irreducible and aperiodic subshifts of finite
type, systems verifying the specification property [6] and even non-uniformly
hyperbolic maps of the interval [H]. It is therefore of importance to study
the measure of the set of points that deviate from the almost-sure behavior, a
quantity that typically decays exponentially in time. To do this precisely, one
defines the deviation function $M(\delta)$,

$$M(\delta) := \lim_{n \to \infty} \frac{1}{n} \log \mu\left(\left\{x \in X \ \text{s.t.}\ \frac{\tau(C_n(x))}{n} \leq \delta\right\}\right). \tag{6}$$

In the language of the previous section this becomes

$$M(\delta) = \lim_{n \to \infty} \frac{1}{n} \log P(n, \delta n). \tag{7}$$

In [4] it has been proven that, for $\psi$-mixing dynamical systems with some
restrictions (see the original paper for definition and further specifications) the
limit above exists and is related to the generalized Rényi entropies $H$ of the
invariant measure $\mu$ via

$$M(\delta) = (\delta - 1) H\left(\frac{1}{\delta} - 1\right), \tag{8}$$

whenever the latter function exists; it is defined via a summation over all
cylinders $C$ of length $n$,

$$H(\beta) := -\lim_{n \to \infty} \frac{1}{n \beta} \log \sum_{C \in \mathcal{C}^n} \mu(C)^{\beta + 1}. \tag{9}$$

Observe that meaningful values of $\delta$ in eq. (6) range from zero to one, so that
$\beta = \delta^{-1} - 1$ is always larger than zero. For non-integer values of $1/\delta$, a linear
interpolation of the values provided by eq. (8) applies.

Rényi entropies have been introduced in [27] and they have been extensively
studied for their connections with various generalized spectra of dimensions of
invariant sets, see for instance [TBI El [TDl HZl H31 ED El]. The restrictions
in [4] have been removed in [21] and in this last paper the Rényi entropies have
been proved to exist for a weaker class of $\psi$-mixing measures.

On the other hand, one might try to answer similar questions when balls are
considered in place of cylinders. For instance, if $B_\varepsilon(x)$ is the ball of radius $\varepsilon$
centered at $x \in X$, the natural generalization of the limit (5) is the quantity

$$\eta(x) := \lim_{\varepsilon \to 0^+} \frac{\tau(B_\varepsilon(x))}{-\log \varepsilon}. \tag{10}$$

For a large class of maps of the interval, it has been proved in [53] that $\eta(x)$
exists for $\mu$-almost all $x$ and is equal to the inverse of $\lambda$, the Lyapunov exponent
of the measure $\mu$. For hyperbolic smooth diffeomorphisms of a compact manifold
a similar result holds, to the extent that

$$\frac{1}{\Lambda_u} - \frac{1}{\Lambda_s} \leq \liminf_{\varepsilon \to 0} \frac{\tau(B_\varepsilon(x))}{-\log \varepsilon} \leq \limsup_{\varepsilon \to 0} \frac{\tau(B_\varepsilon(x))}{-\log \varepsilon} \leq \frac{1}{\lambda_u} - \frac{1}{\lambda_s}, \tag{11}$$

where $\Lambda_u$ and $\Lambda_s$ are the largest and the smallest Lyapunov exponents, while
$\lambda_u$ is the smallest positive Lyapunov exponent and $\lambda_s$ is the largest negative
Lyapunov exponent, see [30]. In the case of diffeomorphisms in two dimensions,
the above formula leads to the equality

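$$\lim_{\varepsilon \to 0} \frac{\tau(B_\varepsilon(x))}{-\log \varepsilon} = \frac{1}{\Lambda_u} - \frac{1}{\Lambda_s} = \frac{1}{\lambda_u} - \frac{1}{\lambda_s} = \frac{D_1(\mu)}{h_\mu}. \tag{12}$$

The last equality above is Young's formula, in which $D_1(\mu)$ is the information
dimension and $h_\mu$ is the metric entropy of the measure $\mu$. One of the goals of
this paper is to generalize the results (7), (8) to the case of balls.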



3 Return times in Bernoulli shifts 

In this section we take a close look at Bernoulli processes, that are the simplest,
yet significant, example of $\psi$-mixing systems. For these, eq. (8) holds and the
function $H$ can be easily computed. In our view, this result is particularly significant,
because it relates a thermodynamical quantity, the spectrum of Rényi
entropies, to the statistics of return times. Our aim is to investigate the kind
of convergence holding for eq. (7) and more generally the form of the distribution
$p(n, k)$. We shall also introduce a quantity slightly different from $P(n, \delta n)$,
still based on return times, that also yields $M(\delta)$ in the limit, but for which
convergence is much faster.

Let us start from notations: we consider the full shift on the space of sequences
of $M$ symbols, $\Sigma := \{0, \ldots, M-1\}^{\mathbb{Z}_+}$. Let us stipulate that the
cylinders in $\mathcal{C}^n$ can be written as $C_{\sigma_0, \ldots, \sigma_{n-1}}$, where $\sigma_i \in \{0, \ldots, M-1\}$, for
$i = 0, \ldots, n-1$. For short, we shall sometimes write $\sigma$ for the vector of indices
$\sigma_i$, and $|\sigma|$ will be the length of this vector; $\sigma$ is also called a "word" in symbolic
language. With a slight abuse of notation, we shall sometimes write $\tau(\sigma)$ and
$\mu(\sigma)$, the return time and the measure of the word $\sigma$, in place of $\tau(C_\sigma)$ and
$\mu(C_\sigma)$, the return time and the measure of the cylinder $C_\sigma$ labeled by the word
$\sigma$. Similarly, we shall let $\Sigma_n := \{0, \ldots, M-1\}^n$ indicate the set of words of
length $n$ and $\mathcal{C}^n$ the set of the associated cylinders, at times interchangeably.

The Bernoulli invariant measure on $\Sigma$ is induced by the set of probabilities
$\{\pi_j\}$, $j = 0, \ldots, M-1$. For simplicity, in the numerical simulations,
we shall consider the two-symbol (coin toss) Bernoulli game of parameter $q$:
$\pi_0 = q$, $\pi_1 = 1 - q$.

Fig. 1 depicts the function $p(n, k)$ for $q = 0.3$. Two features are evident:
the first is the slow decay of $p(n, 1)$ with increasing $n$. The second is the much
faster decay of $p(n, 2)$. Both these features, and more, can be explained by
computing the asymptotic behavior of $p(n, k)$ for $k$ fixed and large $n$.

The case $k = 1$ follows immediately from the observation that the only words
$\sigma$ for which $\tau(\sigma) = 1$ are composed of a single symbol: $\sigma_i = j$ for all $i$, where
$j \in \{0, \ldots, M-1\}$. The measure of the associated cylinders is $\pi_j^n$, so that
$p(n, 1)$ is the sum of these quantities over all $j$, and behaves asymptotically as
$a^n$, where $a = \max\{\pi_j\}$.

Figure 1: Distribution function $p(n, k)$ for the Bernoulli game with $q = 0.3$. The vertical axis is $\log(p(n, k))$.
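The definition of $p(n, k)$ for cylinders can be checked directly on small examples. The following Python sketch enumerates all words of length $n$ of a Bernoulli scheme, computes the first return time of each cylinder as the smallest shift at which the word overlaps itself, and accumulates the cylinder measures. The function names are illustrative and are not taken from the paper; exhaustive enumeration is only feasible for small $n$.

```python
from itertools import product

def first_return_time(word):
    """First return time of the cylinder labelled by `word` in the full shift.

    A point of the cylinder comes back to it after k steps exactly when the
    word overlaps itself under a shift by k, i.e. word[k:] == word[:n-k].
    """
    n = len(word)
    for k in range(1, n + 1):
        if word[k:] == word[:n - k]:
            return k  # k = n always matches (empty overlap), so the loop ends here

def return_time_distribution(n, probs):
    """Return a list p with p[k] = p(n, k) for k = 1..n (p[0] is unused)."""
    p = [0.0] * (n + 1)
    for word in product(range(len(probs)), repeat=n):
        measure = 1.0
        for symbol in word:
            measure *= probs[symbol]  # Bernoulli (product) measure of the cylinder
        p[first_return_time(word)] += measure
    return p

q, n = 0.3, 10
p = return_time_distribution(n, [q, 1.0 - q])
# p(n, 1) collects only the constant words, so it should equal q**n + (1-q)**n.
print(p[1], q**n + (1.0 - q)**n)
```

The comparison printed at the end is the statement of the $k = 1$ case discussed above.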

Let now $\mathcal{T}_{n,k}$ be the set of words of length $n$ with first return time $k$:

$$\mathcal{T}_{n,k} = \{\sigma \in \Sigma_n \ \text{s.t.}\ \tau(\sigma) = k\}. \tag{13}$$

The first observation is that for any $\sigma \in \mathcal{T}_{n,k}$ the symbols $\sigma_j$ repeat periodically
with period $k$:

$$\sigma_{k+j} = \sigma_j, \quad j = 0, \ldots, n-k-1. \tag{14}$$

Define the periodic replication operator $P^{n,k}$ as that which takes any word $\sigma$ of
length $k$ into the word $\sigma' = P^{n,k}(\sigma)$ of length $n \geq k$ that satisfies eq. (14). It
is of relevance now to consider the set of words $\mathcal{W}_{n,k}$ defined implicitly by

$$\mathcal{T}_{n,k} = P^{n,k}(\mathcal{W}_{n,k}). \tag{15}$$

The meaning of this definition is simple: words in $\mathcal{W}_{n,k}$ have length $k$ and are
all and only the periodic "roots" of words of $\mathcal{T}_{n,k}$:

$$\mathcal{W}_{n,k} = \{\sigma \in \Sigma_k \ \text{s.t.}\ \tau(P^{n,k}(\sigma)) = k\}. \tag{16}$$

Controlling $\mathcal{W}_{n,k}$ is then the same thing as controlling $\mathcal{T}_{n,k}$.
It is not difficult to prove that for any $k$

$$\mathcal{T}_{k,k} = \mathcal{W}_{k,k} \subset \mathcal{W}_{k+1,k} \subset \cdots \subset \mathcal{W}_{2k,k} = \mathcal{W}_{2k+1,k} = \cdots = \mathcal{W}_{2k+j,k}, \tag{17}$$

for any $j > 0$. Therefore, when $n \geq 2k$ the set of "roots" $\mathcal{W}_{n,k}$ is constant in $n$.
Define

$$m(k) = \max_{\sigma \in \mathcal{W}_{2k,k}} \#\{i \ \text{s.t.}\ \sigma_i = 0\}. \tag{18}$$

This is the maximum number of zeros in a word in $\mathcal{W}_{2k,k}$. Since all sets $\mathcal{W}_{n,k}$
are invariant under permutation of the symbols $\{0, \ldots, M-1\}$, $m(k)$ is also the
maximum number of any other symbol in a word in $\mathcal{W}_{2k,k}$. Suppose now that
only one symbol in $\{0, \ldots, M-1\}$ has probability $a$, so that the probability of
all other symbols is less than $a$. For simplicity, let us only consider the case of
$M = 2$. Then, the probability of the word for which $m(k)$ is obtained gives the
leading term in the asymptotics of $p(n, k)$:

$$\log(p(n, k)) \sim \frac{n}{k} \left[ m(k) \log a + (k - m(k)) \log(1 - a) \right]. \tag{19}$$

It is also rather easy to see that $m(k) = k - 1$ for $k \geq 2$. In fact, $m(k) = k$ is possible
only for $k = 1$, and $\sigma_{k-1} = 1$, $\sigma_j = 0$ for $j = 0, \ldots, k-2$ is a word in $\mathcal{W}_{2k,k}$.
Then,

$$\frac{1}{n} \log(p(n, k)) \sim \left(1 - \frac{1}{k}\right) \log a + \frac{1}{k} \log(1 - a). \tag{20}$$

This formula gives the decay rates of $p(n, k)$. They are increasing functions of
$k \geq 2$, the most negative being precisely that of $p(n, 2)$, and tend to $\log(a)$ when
$k$ goes to infinity.
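The claim $m(k) = k - 1$ and the monotonicity of the decay rates in eq. (20) can be illustrated with a short brute-force computation for the two-symbol case. This is only a sketch: the helper names (`periodic_replication`, `roots`, `max_zeros`, `decay_rate`) are hypothetical, and the root sets are built directly from definition (16).

```python
import math
from itertools import product

def first_return_time(word):
    """Smallest k >= 1 at which the word overlaps itself under a shift by k."""
    n = len(word)
    for k in range(1, n + 1):
        if word[k:] == word[:n - k]:
            return k

def periodic_replication(sigma, n):
    """P^{n,k}: continue the word sigma (length k) periodically up to length n."""
    k = len(sigma)
    return tuple(sigma[i % k] for i in range(n))

def roots(n, k, alphabet_size=2):
    """W_{n,k}: words of length k whose periodic replication has return time k."""
    return [s for s in product(range(alphabet_size), repeat=k)
            if first_return_time(periodic_replication(s, n)) == k]

def max_zeros(k):
    """m(k): maximum number of zeros among the words of W_{2k,k}."""
    return max(s.count(0) for s in roots(2 * k, k))

def decay_rate(a, k):
    """Right-hand side of eq. (20) for the two-symbol scheme."""
    return (1.0 - 1.0 / k) * math.log(a) + math.log(1.0 - a) / k

for k in range(2, 7):
    print(k, max_zeros(k), decay_rate(0.7, k))
```

With $a = 0.7$ the printed rates increase with $k$ and approach $\log(0.7)$, in line with the discussion of eq. (20).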

Let us now consider the asymptotic behavior of $p(n, k)$ over the line $k = \delta n$,
where $0 < \delta < 1$. Firstly, it is clear that $\mathcal{W}_{n,k} \neq \Sigma_k$, since some words of length
$k$ may be associated with return times shorter than $k$, and one of the technical
achievements of refs. [4] [21] is to deal properly with this issue. This fact
notwithstanding, we may begin by assuming heuristically that among all words
$\sigma$ in $\Sigma_k$, those associated with return times smaller than $k$, and therefore not
in $\mathcal{W}_{n,k}$, are statistically negligible, in some limit. More refined arguments will
follow later in this section. Under this assumption, the probability $p(n, k)$ can
be approximated by a sum over all words of length $k$:

$$p(n, k) = \sum_{\sigma \in \mathcal{W}_{n,k}} \mu(P^{n,k}(\sigma)) \simeq \sum_{\sigma \in \Sigma_k} \mu(P^{n,k}(\sigma)). \tag{21}$$

Formula (21) can be further developed by estimating the cylinder measure $\mu(\sigma')$,
where $\sigma' = P^{n,k}(\sigma)$. Since $|\sigma| = k$, $|\sigma'| = n$, and since $\sigma'$ satisfies eq. (14),

$$\log \mu(\sigma') = \frac{n}{k} \log \mu(\sigma), \tag{22}$$

exactly whenever $n/k$ is an integer (and approximately in the other cases) so that

$$p(n, k) \simeq \sum_{\sigma \in \Sigma_k} \mu(C_\sigma)^{n/k}. \tag{23}$$

If we now set $k = \delta n$, with $0 < \delta < 1$ and $\delta$ the inverse of an integer, we can
estimate

$$\lim_{n \to \infty} \frac{1}{n} \log p(n, \delta n) = \delta \lim_{k \to \infty} \frac{1}{k} \log\left( \sum_{\sigma \in \Sigma_k} \mu(C_\sigma)^{1/\delta} \right). \tag{24}$$

If we now compare this equation with the definition of Rényi entropies, eq. (9),
we easily obtain that

$$M(\delta) = \lim_{n \to \infty} \frac{1}{n} \log p(n, \delta n) = (\delta - 1) H\left(\frac{1}{\delta} - 1\right). \tag{25}$$

Observe that this heuristic result has been derived for $p(n, k)$ rather than
$P(n, k)$, for which it is known to hold rigorously, modulo the linear interpolation
required for non-integer values of $1/\delta$. Indeed, numerical experiments on
Bernoulli schemes with two symbols (coin toss) show that formula (25) holds.
For instance, Figure 4 shows the logarithm of both $p(n, \delta n)$ and $P(n, \delta n)$ versus
$n$, for $q = 0.3$ and $\delta = 0.5$, compared with a straight line $g_\delta(n) = an + b$ with
slope $a = (\delta - 1) H(\frac{1}{\delta} - 1)$. Both quantities are well fitted by a function of the
kind $g_\delta(n) + c e^{dn}$.
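For a Bernoulli measure the sum over cylinders in eq. (9) factorizes, $\sum_{C} \mu(C)^{\beta+1} = (\sum_j \pi_j^{\beta+1})^n$, so that $H(\beta) = -\frac{1}{\beta} \log \sum_j \pi_j^{\beta+1}$ and the slope in eq. (25) can be evaluated in closed form. The Python sketch below does this; for $\delta$ not the inverse of an integer it assumes that the interpolation is linear in $\delta$ between the two nearest inverses of integers. Function names are illustrative. The caption of Fig. 7 quotes the value $-0.307795889757108$ for $q = 0.3$, $\delta = 0.4$, which this sketch should reproduce if the assumed interpolation rule is the intended one.

```python
import math

def renyi_entropy(beta, probs):
    """H(beta) of eq. (9) for a Bernoulli measure (closed form, beta > 0)."""
    return -math.log(sum(p ** (beta + 1.0) for p in probs)) / beta

def deviation_at_inverse_integer(m, probs):
    """M(1/m) = (1/m - 1) H(m - 1) for an integer m >= 2, as in eq. (25)."""
    delta = 1.0 / m
    return (delta - 1.0) * renyi_entropy(m - 1.0, probs)

def deviation(delta, probs):
    """M(delta) for 0 < delta <= 1/2, interpolating linearly in delta (assumption)."""
    m = math.floor(1.0 / delta + 1e-12)   # 1/(m+1) < delta <= 1/m
    hi, lo = 1.0 / m, 1.0 / (m + 1)
    if abs(delta - hi) < 1e-12:
        return deviation_at_inverse_integer(m, probs)
    weight = (delta - lo) / (hi - lo)     # weight of the value at delta = 1/m
    return (weight * deviation_at_inverse_integer(m, probs)
            + (1.0 - weight) * deviation_at_inverse_integer(m + 1, probs))

print(deviation(0.5, [0.3, 0.7]))
print(deviation(0.4, [0.3, 0.7]))
```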

To validate this picture, confirmed by numerical experiments on other values
of $q$ and $\delta$ (for $\delta$ not equal to the inverse of an integer, a linear interpolation
formula applies [HE]), we plot in Fig. 5 the logarithm of the difference between
data and the straight line in Fig. 4. An exponential decay is clearly observed.

In conclusion, numerical experimentation supports the hypothesis that the
asymptotic behavior in eq. (25) is attained with a decaying exponential term
parameterized by the constants $c$ and $d < 0$, together with a slowly decaying
contribution arising from the constant $b$:

$$\frac{1}{n} \log p(n, \delta n) \sim M(\delta) + \frac{b}{n} + \frac{c}{n} e^{dn}. \tag{26}$$

We shall introduce momentarily a new quantity to improve on this kind of 
convergence. 

The same heuristic arguments imply an approximate form for the behavior
of the distribution function for different $n$ and $k$. Carrying out the computations
of eq. (23) together with eq. (9) we obtain

$$\log(p(n, k)) \sim (k - n) H\left(\frac{n}{k} - 1\right). \tag{27}$$

We can test this approximation in the case of the Bernoulli scheme with
$q = 1/2$. Here, trivially the approximation (27) becomes $\log(p(n, k)) \sim (k - n) \log(2)$.
This function fits almost perfectly the numerical data in Fig. 2.

A less favorable case is offered by the Bernoulli scheme with $q = 0.3$. In Fig. 3
we plot $P(n, k)$ and $p(n, k)$ versus $k$ for $n = 30$, together with the approximation
provided by eq. (27). We notice that the latter fits both functions well at $k = 1$
(quite obviously, this behavior being associated with the measure of the cylinders
of unit return time, see above), while the approximation stays reasonable only for
$P(n, k)$ at small values of $k > 1$. Then, in the intermediate region the slope
of both curves agrees with the approximation (27), while the numerical value agrees only for
$p(n, k)$.

Certainly, the agreement observed in the last figure is far from satisfactory.
The reason is to be found in the approximation made in eq. (21). The same
fact is at the origin of the slow convergence observed in eq. (26). We conclude
this section unveiling these reasons and providing a more rigorous and
insightful treatment of the problem. This improvement is inspired by the idea
of summation over prime periodic orbits in dynamical systems.

Figure 2: Distribution function $p(n, k)$ (crosses, red lines) and the approximation
function in eq. (27) (x, green line) versus $n$ and $k$ in the case of a Bernoulli
game with $q = 1/2$.

As we mentioned, the approximation in eq. (21) above is based on the idea
that the roles of $\Sigma_k$ and $\mathcal{W}_{n,k}$ can be interchanged, the effects of their difference
being negligible in the limit. The price to pay in this procedure is the slowly
decaying term in the asymptotics (26). We can be more careful: indeed, we
can show that $\Sigma_k$ can be rigorously partitioned into the periodic repetition of
different sets $\mathcal{W}_{n,k'}$. The lemma is the following: for any $n \geq 2k$

$$\Sigma_k = \bigcup_{k' | k} P^{k,k'}(\mathcal{W}_{n,k'}), \tag{28}$$

where the union is over all integers $k'$ that divide $k$ and where the sets $P^{k,k'}(\mathcal{W}_{n,k'})$
consist of the words of length $k$ obtained by repeating a word of length $k'$ for $k/k'$
full cycles. These sets are pairwise disjoint.
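The lemma in eq. (28) lends itself to a brute-force check on small alphabets. The sketch below, with hypothetical helper names, builds the root sets from definition (16), replicates them to length $k$, and tests that the resulting sets are pairwise disjoint and cover all words of length $k$.

```python
from itertools import product

def first_return_time(word):
    """Smallest k >= 1 at which the word overlaps itself under a shift by k."""
    n = len(word)
    for k in range(1, n + 1):
        if word[k:] == word[:n - k]:
            return k

def periodic_replication(sigma, n):
    """P^{n,k}: continue the word sigma (length k) periodically up to length n."""
    k = len(sigma)
    return tuple(sigma[i % k] for i in range(n))

def roots(n, k, alphabet_size):
    """W_{n,k}: words of length k whose periodic replication has return time k."""
    return [s for s in product(range(alphabet_size), repeat=k)
            if first_return_time(periodic_replication(s, n)) == k]

def check_partition(n, k, alphabet_size=2):
    """True if the sets P^{k,k'}(W_{n,k'}), k' dividing k, partition Sigma_k."""
    pieces = []
    for k_prime in range(1, k + 1):
        if k % k_prime == 0:
            pieces.append({periodic_replication(s, k)
                           for s in roots(n, k_prime, alphabet_size)})
    union = set().union(*pieces)
    disjoint = sum(len(piece) for piece in pieces) == len(union)
    complete = union == set(product(range(alphabet_size), repeat=k))
    return disjoint and complete

print(check_partition(n=12, k=6))
```

Running the same check with $n < 2k$ (for instance $n = 7$, $k = 6$) shows that the condition on $n$ in the lemma cannot simply be dropped.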

On the basis of this lemma, we can reorder the summation in eq. (21) to get:

$$\sum_{\sigma \in \Sigma_k} \mu(P^{n,k}(\sigma)) = \sum_{k' | k} \sum_{\sigma \in \mathcal{W}_{n,k'}} \mu(P^{n,k} \circ P^{k,k'}(\sigma)). \tag{29}$$

Figure 3: Distributions $P(n, k)$ (X, blue line), $p(n, k)$ (crosses, red line) and the
approximation function in eq. (27) (green line) versus $k$ for $n = 30$, in the case
of a Bernoulli process with $q = 0.3$.


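Since $k'$ divides $k$, $P^{n,k} \circ P^{k,k'} = P^{n,k'}$ and therefore the chain of equalities
continues with

$$\sum_{\sigma \in \Sigma_k} \mu(P^{n,k}(\sigma)) = \sum_{k' | k} \sum_{\sigma \in \mathcal{W}_{n,k'}} \mu(P^{n,k'}(\sigma)) = \sum_{k' | k} \sum_{\sigma \in \mathcal{T}_{n,k'}} \mu(\sigma), \tag{30}$$

where we have used eq. (15). Finally,

$$\sum_{\sigma \in \Sigma_k} \mu(P^{n,k}(\sigma)) = \sum_{k' | k} \sum_{\sigma \in \mathcal{T}_{n,k'}} \mu(\sigma) = \sum_{k' | k} p(n, k'). \tag{31}$$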

On the other hand,

$$\mu(P^{n,k}(\sigma)) = (\mu(\sigma))^{n/k}, \tag{32}$$

exactly when $n/k$ is an integer, and approximately otherwise, so that, choosing
$k/n = \delta$ (with $\delta n = k$ an integer) and defining the new quantity

$$Z_p(\delta, n) := \sum_{k' | \delta n} p(n, k'), \tag{33}$$



we find that

$$M_p(\delta) = \lim_{n \to \infty} \frac{1}{n} \log(Z_p(\delta, n)) = (\delta - 1) H\left(\frac{1}{\delta} - 1\right). \tag{34}$$

This formula is valid for all $0 < \delta < 1$ that are the inverse of an integer. For the
other values, the linear interpolation between the values at the nearest inverses
of an integer applies.

Figure 4: Distribution function $p(n, \delta n)$ for a Bernoulli process with $q = 0.3$,
$\delta = 0.5$ (X) and cumulative distribution function $P(n, \delta n)$ of the same process
(crosses). The first function has been shifted upwards by a fixed quantity to
match the latter. Both sets of data are consistent with the behavior described
in the text: the straight blue line has indeed slope $(\delta - 1) H(\frac{1}{\delta} - 1)$, and the
green fitting curve (which becomes asymptotically tangent to the blue line) is
given by eq. (26).

Numerical verification follows: in Fig. 6 we plot the logarithm of $P(n, \delta n)$,
$p(n, \delta n)$ and of the period summation function $Z_p(\delta, n)$ versus $n$ for the Bernoulli
process with $q = 0.3$. The line $g_\delta(n) = M(\delta) n$ fits almost perfectly the last set of
data. On the other hand, the other two functions display the slow convergence
discussed above. To further appreciate the improvement brought about by using
$Z_p$ consider Fig. 7, where we plot the difference between successive values of the
logarithm of the above functions, and these logarithms divided by $n$. All these
quantities have limit $M(\delta)$. We compute this value from the linear interpolation
of $(\delta - 1) H(\frac{1}{\delta} - 1)$ to get the value tabulated. The distributions $p$ and $P$ converge
slowly, while $Z_p$ gives a numerically exact result. In conclusion, the term $b/n$
plaguing the convergence in eq. (26) was due to approximate counting and does
not show up for the newly introduced quantity.
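The exactness of the period summation can be made concrete for a Bernoulli scheme: by eqs. (31) and (32), when $n \geq 2k$ and $n/k$ is an integer, $Z_p(\delta, n)$ equals $(\sum_j \pi_j^{n/k})^k$ with no correction term. The following sketch checks this by exhaustive enumeration for a small case; the function names are illustrative and the parameters are chosen only to keep the enumeration cheap.

```python
from itertools import product

def first_return_time(word):
    """Smallest k >= 1 at which the word overlaps itself under a shift by k."""
    n = len(word)
    for k in range(1, n + 1):
        if word[k:] == word[:n - k]:
            return k

def return_time_distribution(n, probs):
    """Return a list p with p[k] = p(n, k) for k = 1..n (p[0] is unused)."""
    p = [0.0] * (n + 1)
    for word in product(range(len(probs)), repeat=n):
        measure = 1.0
        for symbol in word:
            measure *= probs[symbol]
        p[first_return_time(word)] += measure
    return p

def period_summation(n, k, probs):
    """Z_p(delta, n) of eq. (33) with k = delta * n: sum of p(n, k') over divisors k' of k."""
    p = return_time_distribution(n, probs)
    return sum(p[d] for d in range(1, k + 1) if k % d == 0)

q, n, k = 0.3, 12, 6                      # delta = k / n = 1/2
z = period_summation(n, k, [q, 1.0 - q])
closed_form = (q ** (n // k) + (1.0 - q) ** (n // k)) ** k
print(z, closed_form)
```

Taking logarithms, $\log Z_p(\delta, n) = n \, \delta \log \sum_j \pi_j^{1/\delta}$ exactly in this setting, which is consistent with the absence of the slowly decaying term.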



Figure 5: Logarithm of the difference between the logarithm of the cumulative
distribution function $P(n, \delta n)$ of the Bernoulli process with $q = 0.3$, $\delta = 0.5$
and the straight line $g_\delta(n)$ in Fig. 4, plotted versus $n$. The fitting straight (green) line implies
the exponential decay used in the fit in Fig. 4.

4 Expanding maps of the interval 

In this section, we study a family of one-dimensional dynamical systems for 
which we can derive both a formula for the asymptotic distribution and a gen- 
eralization of the deviation result. This family will also serve to begin to under- 
stand the dynamical phenomena occurring when considering ball, rather than 
cylinder, return times: one has to match geometry and dynamics. Further detail 
will be added in the following section. 

We start by constructing a family of measures supported in $[0, 1]$ by means
of an affine iterated function system: given a set of $M$ non-overlapping intervals
$I_j = [a_j, b_j] \subset [0, 1]$, $j = 0, \ldots, M-1$, define the lengths $\delta_j = b_j - a_j$, and
construct the affine maps

$$\phi_j(x) := \delta_j x + a_j, \quad j = 0, \ldots, M-1. \tag{35}$$

Each map $\phi_j$ takes $[0, 1]$ into $[a_j, b_j]$. Consider then the set action $\Phi$ that maps
the set $A \subset [0, 1]$ into $\Phi(A) := \bigcup_{j=0}^{M-1} \phi_j(A)$. Repeated action of $\Phi$ on $[0, 1]$
defines a Cantor set $\mathcal{S}$ in $[0, 1]$:

$$\mathcal{S} := \bigcap_{k=1}^{\infty} \Phi^k([0, 1]). \tag{36}$$

Figure 6: Logarithm of the cumulative distribution function $P(n, \delta n)$ (red line,
pluses) and of the distribution function $p(n, \delta n)$ (blue line, stars) together with
$\log Z_p(\delta, n)$ (large crosses, light blue) and the line $g_\delta(n) = M(\delta) n$ (magenta) for
the Bernoulli process with $q = 0.3$, $\delta = 0.4$.


We can then construct a family of measures whose support is this set $\mathcal{S}$:
let us choose real numbers $\{\pi_j\}_{j=0,\ldots,M-1}$ such that $\pi_j > 0$, $\sum_{j=0}^{M-1} \pi_j = 1$, and
consider the unique measure $\mu$ for which

$$\int f(s) \, d\mu(s) = \sum_{j=0}^{M-1} \pi_j \int (f \circ \phi_j)(s) \, d\mu(s) \tag{37}$$

holds for any continuous function $f$. It is then easy to show that, for any choice
of the set of real numbers $\{\pi_j\}$, the measure $\mu$ is mixing for the piecewise linear
transformation $T$ defined on $\mathcal{S}$ by:

$$T(x) = \frac{1}{\delta_j}(x - a_j), \quad \text{if } x \in I_j = [a_j, b_j]. \tag{38}$$

In fact, the maps $\{\phi_j\}$ turn out to be the inverse branches of $T$. As such, they
can be employed to build the cylinders $\mathcal{C}^n$ of this dynamical system: letting
$\sigma_i \in \{0, \ldots, M-1\}$, for $i = 0, \ldots, n-1$, the latter can be labelled as

$$C_{\sigma_0, \ldots, \sigma_{n-1}} = (\phi_{\sigma_0} \circ \cdots \circ \phi_{\sigma_{n-1}})([0, 1]). \tag{39}$$

Figure 7: Various functions of the Bernoulli process with $q = 0.3$, $\delta = 0.4$.
$\log(p(n, \delta n))/n$: black curve, circles; $\log(p(n, \delta n)) - \log(p(n-1, \delta(n-1)))$:
blue curve, stars; $\log(P(n, \delta n))/n$: open squares, magenta; $\log(P(n, \delta n)) -
\log(P(n-1, \delta(n-1)))$: crosses, red curve; $\log(Z_p(\delta, n))/n$: yellow squares;
$\log(Z_p(\delta, n)) - \log(Z_p(\delta, n-1))$: crosses, green curve. The last two sets of data
sit to numerical precision on the line $h = -0.307795889757108$ that is obtained
by the interpolation formula.

Following eq. (35), the geometric length $\ell(C_\sigma)$ is easily computed:

$$\ell(C_{\sigma_0, \ldots, \sigma_{n-1}}) = \prod_{i=0}^{n-1} \delta_{\sigma_i}. \tag{40}$$

Equally easily, because of eq. (37), the measure $\mu(C_\sigma)$ is

$$\mu(C_{\sigma_0, \ldots, \sigma_{n-1}}) = \prod_{i=0}^{n-1} \pi_{\sigma_i}. \tag{41}$$

Therefore, this dynamical system is metrically equivalent to a Bernoulli shift
on $M$ symbols, with probabilities $\pi_j$, $j = 0, \ldots, M-1$, that we have discussed
in Section 3. Yet, when examining the distribution of ball return times, we
must investigate the geometrical relation between balls of a fixed radius and
cylinders of different symbolic length.

In order to analyze this relation, we observe that any ball $B_\varepsilon(x)$ can be
written as a union over cylinders of an appropriate (fixed) length $n$: that is, for
all $x$ and $\varepsilon$ there exist $n$ and a collection $I$ of indices $\sigma$, $|\sigma| = n$, such that

$$B_\varepsilon(x) = \bigcup_{\sigma \in I} C_\sigma. \tag{42}$$

This is a consequence of the fact that these measures are singular w.r.t.
Lebesgue and their supports have gaps of positive length. Therefore, for sufficiently
large $n$, the two boundary points of the ball end up in the closure of a
gap in the support of $\mu$.

Figure 8: Distribution function $p(\varepsilon, k)$ for the dynamics of the two-map IFS
with $\delta_1 = 0.3$, $\delta_2 = 0.2$, $\pi_1 = 0.3$ and $\pi_2 = 0.7$. The vertical axis is $\log p(\varepsilon, k)$.

Furthermore, since both $n$ and $I$ depend on the ball under consideration, we
define the symbolic length of $B_\varepsilon(x)$ as

$$N_\varepsilon(x) = n - \max\{j \in \mathbb{Z} \ \text{s.t.}\ M^j \leq \#(I)\}, \tag{43}$$

where $\#(I)$ is the cardinality of the set $I$ and where $M$, as before, is the number
of inverse branches of $T$. The idea behind this definition is to measure a sort
of effective length of the cover of $B_\varepsilon(x)$. For instance, if eq. (42) required
$\#(I) = 4$ cylinders of length $n = 3$ with $M = 2$ we would effectively consider
this union as if it were a single cylinder of length $N_\varepsilon = 1$.
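Definition (43) is simple enough to be written as a one-function sketch. The inputs (the cylinder length $n$, the number of cylinders in the cover and the number of branches $M$) are assumed to be already known; how the cover of a given ball is found is not addressed here, and the function name is hypothetical.

```python
def symbolic_length(n, cover_size, branches):
    """N_eps(x) = n - max{j : branches**j <= cover_size}, following eq. (43).

    Integer arithmetic is used instead of a logarithm to avoid rounding
    problems when cover_size is an exact power of `branches`.
    """
    if cover_size < 1:
        raise ValueError("a cover contains at least one cylinder")
    j = 0
    while branches ** (j + 1) <= cover_size:
        j += 1
    return n - j

# Worked example from the text: 4 cylinders of length 3 with M = 2.
print(symbolic_length(3, 4, 2))
```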

This definition is instrumental in formulating a working hypothesis: we surmise
that the statistical distribution of return times of balls $B_\varepsilon(x)$ characterized
by the same symbolic length $N_\varepsilon(x) = n$ will scale as that of cylinders (again,
of that given length $n$) in a Bernoulli shift. This hypothesis is confirmed by
numerical computation. At a fixed radius $\varepsilon$, we compute the return time distribution
for all points $x$ with a fixed symbolic length, computed via eq. (43),
and we compare it with the discrete distribution of the corresponding symbolic
Bernoulli process. Data reported in Figure 9 are obtained for a two-map I.F.S.
dynamics. The agreement is significant.

Figure 9: Distribution function $p(n, \tau)$ for return times of balls of radius $\varepsilon =
8.989 \cdot 10^{-12}$ and $n = N_\varepsilon = 20$ (red line, crosses), for the dynamics of the two-map
IFS described in Fig. 8. It is compared with the distribution function
$p(n, \tau)$, $n = 20$, of the Bernoulli process with $q = 0.3$ (green line, X's). See text
for details.

Therefore, to obtain the distribution of return times of balls of fixed radius $\varepsilon$
we must know the cylinder return time distribution, $p(n, k)$, discussed in Sect. 3,
but also the measure of center points whose balls have a given symbolic length:

$$\Psi(\varepsilon, n) = \mu(\{x \ \text{s.t.}\ N_\varepsilon(x) = n\}). \tag{44}$$

In fact, from this information, we obtain

$$p(\varepsilon, k) = \sum_n \Psi(\varepsilon, n) \, p(n, k). \tag{45}$$

To estimate $\Psi(\varepsilon, n)$, we consider, this time at fixed $n$, the distribution of the
geometric lengths of cylinders, $\ell(C_\sigma)$: let $z = \log(\varepsilon)$, and define

$$\psi(z, n) := \mu(\{x \ \text{s.t.}\ \log(\ell(C_n(x))) \leq z\}). \tag{46}$$

When $n$ is sufficiently large, this can be approximated by a continuous distribution,
precisely, by a normal distribution $\mathcal{N}_{-\lambda n, S\sqrt{n}}(z)$ of mean $-\lambda n$ and
variance $S^2 n$, where $\lambda$ is the Lyapunov exponent of $\mu$, and $S$ the standard deviation
of the multiplicative process. Both quantities can be easily computed in
this case:

$$\lambda = -\sum_{j=0}^{M-1} \pi_j \log(\delta_j), \tag{47}$$

and

$$S^2 = -\lambda^2 + \sum_{j=0}^{M-1} \pi_j (\log(\delta_j))^2. \tag{48}$$
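Formulas (47) and (48) can be cross-checked against the exact first two moments of $\log \ell(C_n(x))$, which for this piecewise linear system is a sum of $n$ independent terms. The sketch below uses the two-map parameters quoted in the caption of Fig. 8 and enumerates all cylinders of a small length; the function names are illustrative, and the enumeration verifies only the mean and the variance, not the normal approximation itself.

```python
import math
from itertools import product

def lyapunov_exponent(probs, lengths):
    """lambda of eq. (47)."""
    return -sum(p * math.log(d) for p, d in zip(probs, lengths))

def variance_rate(probs, lengths):
    """S^2 of eq. (48)."""
    lam = lyapunov_exponent(probs, lengths)
    return -lam ** 2 + sum(p * math.log(d) ** 2 for p, d in zip(probs, lengths))

def log_length_moments(n, probs, lengths):
    """Exact mean and variance of log l(C_n(x)) under mu, by enumerating cylinders."""
    mean = second = 0.0
    for word in product(range(len(probs)), repeat=n):
        measure = math.prod(probs[s] for s in word)          # eq. (41)
        log_len = sum(math.log(lengths[s]) for s in word)    # log of eq. (40)
        mean += measure * log_len
        second += measure * log_len ** 2
    return mean, second - mean ** 2

probs, lengths = [0.3, 0.7], [0.3, 0.2]   # pi_j and delta_j from the caption of Fig. 8
n = 8
mean, variance = log_length_moments(n, probs, lengths)
print(mean, -n * lyapunov_exponent(probs, lengths))
print(variance, n * variance_rate(probs, lengths))
```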

We now conjecture that we can exchange the role of $z$ and $n$ in this derivation,
so that $\Psi(\varepsilon, n)$ (with $\varepsilon$ fixed) can be approximated by the distribution $\psi(z, n)$ (with
$z = \log(\varepsilon)$ fixed) when properly normalized and, in turn, by $\mathcal{N}_{-\lambda n, S\sqrt{n}}(z)$,
with $z = \log(\varepsilon)$ fixed, and properly normalized via the constant $A$ to yield the
discrete distribution $D_\varepsilon(n)$:

$$D_\varepsilon(n) = A \, \mathcal{N}_{-\lambda n, S\sqrt{n}}(\log \varepsilon), \quad \sum_n D_\varepsilon(n) = 1. \tag{49}$$

Figure 10: Distribution function $\Psi(\varepsilon, n)$ (red lines, points) and $\mathcal{N}(z, n)$ (blue
lines) compared (for definitions, see text), in the case of Fig. 8.


This conjecture is validated numerically in Figure 10, which reports $D_\varepsilon(n)$
and $\Psi(\varepsilon, n)$ for the same case of Figs. 8 and 9. Indeed, for our purposes we do not
need the exact form of the limit distribution, but only its scaling behavior in $n$
and $\varepsilon$. In conclusion, we can write the distribution function $p(\varepsilon, k)$ as

$$p(\varepsilon, k) = \sum_n D_\varepsilon(n) \, p(n, k), \tag{50}$$

where $D_\varepsilon(n)$ has mean $\bar{n} = -z/\lambda$ and standard deviation $s = \lambda^{-3/2} S \sqrt{-z}$, and is
sharply localized in the interval $[\bar{n} - s, \bar{n} + s]$. This explains the shape of the
graph reported in Fig. 8, that is to be compared with that of Fig. 1 in Section
3: the Gaussian smoothing is particularly evident near the line $k = -\log(\varepsilon)/\lambda$.
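The mean and standard deviation quoted for $D_\varepsilon(n)$ can be compared with a direct evaluation of eq. (49). The sketch below evaluates the normal density at $z = \log \varepsilon$ for each $n$, normalizes over a finite range of $n$ (the cutoff `n_max` is an assumption of this sketch), and computes the first two moments. It uses the radius quoted in the caption of Fig. 9 and the parameters of Fig. 8; since the distribution is discrete and narrow, only approximate agreement should be expected.

```python
import math

def symbolic_length_distribution(eps, probs, lengths, n_max=200):
    """Discrete distribution D_eps(n) of eq. (49), normalised over n = 1..n_max."""
    z = math.log(eps)
    lam = -sum(p * math.log(d) for p, d in zip(probs, lengths))                    # eq. (47)
    var = -lam ** 2 + sum(p * math.log(d) ** 2 for p, d in zip(probs, lengths))    # eq. (48)
    weights = {}
    for n in range(1, n_max + 1):
        sigma2 = var * n   # variance S^2 n of log l(C_n)
        weights[n] = (math.exp(-(z + lam * n) ** 2 / (2.0 * sigma2))
                      / math.sqrt(2.0 * math.pi * sigma2))
    total = sum(weights.values())   # plays the role of 1/A in eq. (49)
    dist = {n: w / total for n, w in weights.items()}
    return dist, z, lam, math.sqrt(var)

dist, z, lam, S = symbolic_length_distribution(8.989e-12, [0.3, 0.7], [0.3, 0.2])
mean = sum(n * w for n, w in dist.items())
std = math.sqrt(sum((n - mean) ** 2 * w for n, w in dist.items()))
print(mean, -z / lam)                          # numerical mean vs. -z / lambda
print(std, lam ** -1.5 * S * math.sqrt(-z))    # numerical std vs. lambda^(-3/2) S sqrt(-z)
```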

Finally, eq. (50) is the basis to derive formulae akin to those of Sect. 3. In
particular, it validates the analogue of formula (25), which becomes the fundamental
result of this section. For one-dimensional expanding maps, the following
asymptotic formula holds, that links the asymptotics of return time distributions
to the Lyapunov exponent and to Rényi entropies:

$$\lim_{\varepsilon \to 0} \frac{\log p(\varepsilon, -\delta \log(\varepsilon)/\lambda)}{\log \varepsilon} = \frac{1 - \delta}{\lambda} H\left(\frac{1}{\delta} - 1\right). \tag{51}$$

The same observations about the convergence speed of this limit detailed in
Sect. 3 apply here.



5 A second approach to expanding maps 

In this section, we present a second, more general approach to the statistics of 
first returns in balls for one-dimensional piecewise linear expanding maps of the 
type studied in the previous section. This approach can be fruitfully extended to 
more general situations and, informal as it is now, clearly points to the direction 
where rigor can be achieved. 

Recall the theory of Sect. G2: the word labelling each cylinder of length k
was continued periodically to length n to single out a cylinder of length n and
return time k. Geometrically, for the kind of maps studied in Sect. 4, each
cylinder $\sigma$ of length k contains a periodic point of the map, $x_\sigma$,
of period k. Balls of radius $\varepsilon$ centered at a point x located in the
vicinity of such a fixed point have a non-empty intersection with their k-th
iterate if the distance between x and $x_\sigma$ is less than
$\frac{S_k+1}{S_k-1}\varepsilon$, where $S_k$ is the derivative of $T^k$ at the
fixed point $x_\sigma$. Observe that $S_k$ grows geometrically as k grows, so
that when k is large $\frac{S_k+1}{S_k-1} \simeq 1$. In conclusion, all points
x in the interval $B_{\varepsilon/2}(x_\sigma)$ are such that their
$\varepsilon$-neighborhood returns after time k:
$T^k(B_\varepsilon(x)) \cap B_\varepsilon(x) \neq \emptyset$. Of course, not
all of these intervals, labelled by $\sigma$, are disjoint among themselves,
both with the same and different length of $\sigma$. Therefore, two conditions
are to be met to assess a genuine first return.

We may approximately assume that the first condition (non-overlapping of
the interval $B_{\varepsilon/2}(x_\sigma)$ with other intervals associated with
the same period, $|\sigma| = k$) is met when $\frac{S_k+1}{S_k-1}\varepsilon$
is less than the geometrical size, $\ell(C_\sigma)$, of the cylinder that
contains the fixed point. More simply, because of our previous observation,
we may just require $\varepsilon < \ell(C_\sigma)$. Let again
$\Sigma_k := \{0, \ldots, M-1\}^k$ be the set of all words of length k. Within
this set we therefore define the subset

\[ L_{\varepsilon,k} := \{\sigma \in \Sigma_k \text{ s.t. } \varepsilon < \ell(C_\sigma)\}. \tag{52} \]

The second condition (non-overlapping of $B_{\varepsilon/2}(x_\sigma)$ with
intervals of smaller periods $|\sigma|$) is more subtle, and can be resolved by
considering, among all fixed points $x_\sigma$ of period k, only the
primitively periodic ones. We so define the set

\[ W_k := \{\sigma \in \Sigma_k \text{ s.t. there is no } j < k \text{ s.t. } \sigma \text{ is periodic of period } j\}. \tag{53} \]
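The set $W_k$ can be enumerated directly for small k. The sketch below assumes that "periodic of period j" means that the word is the repetition of a block of length j, with j a proper divisor of k; the function names are hypothetical.

```python
from itertools import product

def is_primitive(word):
    """True if the word is not the repetition of a shorter block."""
    k = len(word)
    for j in range(1, k):
        # Only proper divisors of k can be periods of a length-k word here.
        if k % j == 0 and word == word[:j] * (k // j):
            return False
    return True

def primitive_words(M, k):
    """All words of length k over {0, ..., M-1} that belong to W_k."""
    return [w for w in product(range(M), repeat=k) if is_primitive(w)]

# For prime k only the M constant words are excluded.
print(len(primitive_words(2, 4)), len(primitive_words(3, 5)))
```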
Summing up, we can write

\[ p(\varepsilon, k) = \sum_{\sigma \in L_{\varepsilon,k} \cap W_k} \mu(B_\varepsilon(x_\sigma)), \tag{54} \]

where $x_\sigma$ is the periodic point in the cylinder $C_\sigma$. This last
expression can be further simplified, using a similar approximation to that
employed in Sect. 3. In fact, we can write
$\mu(B_\varepsilon(x_\sigma)) \simeq \varepsilon^{\alpha_\mu(x_\sigma)}$, where
$\alpha_\mu(x_\sigma)$ is the local dimension of the measure $\mu$ at
$x_\sigma$. This last quantity can then be extrapolated from the measure and
the geometric length of the cylinder $C_\sigma$:
$\alpha_\mu(x_\sigma) \simeq \log(\mu(C_\sigma))/\log(\ell(C_\sigma))$. As a
consequence, eq. (54) becomes:



\[ p(\varepsilon, k) = \sum_{\sigma \in L_{\varepsilon,k} \cap W_k} \varepsilon^{\log(\mu(C_\sigma))/\log(\ell(C_\sigma))}. \tag{55} \]



We have compared numerically the function $p(\varepsilon, k)$ for the case of
Fig. 8 of the previous section and its approximation, eq. (55), in Fig. 11.
Agreement is rather satisfactory.
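The approximation of eq. (55) is straightforward to evaluate for small k. The sketch below is not the computation behind Fig. 11: it assumes a full-branch piecewise-linear map with a Bernoulli invariant measure, so that $\mu(C_\sigma)$ and $\ell(C_\sigma)$ are products of per-symbol weights and lengths. The function names and sample values are hypothetical.

```python
import math
from itertools import product

def is_primitive(word):
    """True if the word is not the repetition of a shorter block."""
    k = len(word)
    return not any(k % j == 0 and word == word[:j] * (k // j)
                   for j in range(1, k))

def p_approx(eps, k, weights, lengths):
    """Evaluate the sum of eq. (55) over primitive words with eps < l(C)."""
    log_eps = math.log(eps)
    total = 0.0
    for word in product(range(len(weights)), repeat=k):
        log_mu = sum(math.log(weights[s]) for s in word)   # log mu(C_sigma)
        log_len = sum(math.log(lengths[s]) for s in word)  # log l(C_sigma)
        if log_eps < log_len and is_primitive(word):
            total += math.exp(log_eps * log_mu / log_len)
    return total

# Illustrative two-branch example (values are not taken from the paper).
for k in range(2, 9):
    print(k, p_approx(1e-3, k, [0.5, 0.5], [0.25, 0.75]))
```

In the uniform case (equal weights and lengths) every admissible word contributes exactly $\varepsilon$, so the sum reduces to $\varepsilon$ times the number of primitive words.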

Eq. (55) can also be written

\[ p(\varepsilon, k) = \sum_{\sigma \in L_{\varepsilon,k} \cap W_k} \mu(C_\sigma)^{\log(\varepsilon)/\log(\ell(C_\sigma))}. \tag{56} \]

This last equation is particularly meaningful. Observe first that this generalizes
eqs. (|2"Tj) and following and can be used to the same scope. Secondly, put
$\lambda_\sigma := -\log(\ell(C_\sigma))/|\sigma|$. Then, $\mu$ almost surely,
when $|\sigma|$ tends to infinity, $\lambda_\sigma$ converges to the Lyapunov
exponent, $\lambda$, of the measure $\mu$. Therefore,

\[ p(\varepsilon, k) = \sum_{\sigma \in L_{\varepsilon,k} \cap W_k} \mu(C_\sigma)^{-\log(\varepsilon)/(\lambda|\sigma|)}. \tag{57} \]

Suppose now that we choose $\varepsilon$ and $|\sigma|$ such that
$-\delta \log \varepsilon = |\sigma|\lambda$, with $0 < \delta < 1$. This
choice has two effects. First, the exponent in the previous equation becomes
$1/\delta$. Second, the measure of the set $L_{\varepsilon,k}$ tends to one
when $\varepsilon$ tends to zero, because $\mu$ almost surely
$-\log(\ell(C_\sigma))/|\sigma|$ tends to $\lambda$, so that almost surely
$\ell(C_\sigma) > \varepsilon$. Hence,

\[ p(\varepsilon, -\delta \log(\varepsilon)/\lambda) \simeq \sum_{\sigma \in W_k} \mu(C_\sigma)^{1/\delta}. \tag{58} \]

[Plot omitted; axis label: $\log p(\varepsilon, k)$]

Figure 11: Distribution function $p(\varepsilon, k)$ (green lines) from the
original data in Fig. 8 and approximation from formula eq. (55) (red lines)



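If we now let k be a prime number, the set $W_k$ contains all words except
the "fixed points" $\sigma_i = a$, for $i = 0, \ldots, k-1$, where
$a \in \{0, \ldots, M-1\}$. In general, one should also subtract all words of
shorter periods that divide k, as done above. Call this set of words $F_k$.
Then,

\[ p(\varepsilon, -\delta \log(\varepsilon)/\lambda) \simeq \sum_{\sigma \in \Sigma_k} \mu(C_\sigma)^{1/\delta} - \sum_{\sigma \in F_k} \mu(C_\sigma)^{1/\delta}. \tag{59} \]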

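It could be shown that, in systems with sufficiently fast decay of correlations like
Bernoulli or Markov, the first term is dominant in the limit, so that discarding
the second when taking logarithms and dividing by $\log \varepsilon$, one gets

\[ \frac{1}{\log \varepsilon} \log p\Big(\varepsilon, -\frac{\delta \log(\varepsilon)}{\lambda}\Big) \simeq \frac{1}{\log \varepsilon} \log\Big(\sum_{\sigma \in \Sigma_k} \mu(C_\sigma)^{1/\delta}\Big) = -\frac{\delta}{\lambda k} \log\Big(\sum_{\sigma \in \Sigma_k} \mu(C_\sigma)^{1/\delta}\Big). \tag{60} \]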

Taking the limit, we obtain a new verification of eq. (51):

\[ \lim_{\varepsilon \to 0} \frac{\log p(\varepsilon, -\delta \log(\varepsilon)/\lambda)}{\log \varepsilon} = \frac{1-\delta}{\lambda}\, H_{\frac{1}{\delta}-1}. \tag{61} \]

In the next section, we shall see a generalization of this equation.



6 Linear automorphisms of the two-dimensional torus



The general framework presented in the last section can be easily extended to 
treat the case of linear automorphisms of the two-dimensional torus, of which 
the Arnol'd cat map is the most celebrated example. For convenience, we choose 
a metric in the torus such that balls of radius $\varepsilon$ are euclidean
squares of side $2\varepsilon$ with sides oriented along the stable and
unstable directions and for simplicity we consider the case when these
directions are orthogonal. Then, one can easily show that around any fixed
point of the k-th iteration of the map there exists a rectangle, with sides
oriented in the stable and unstable directions, of points x whose
$\varepsilon$-balls intersect their image after k iterations. Letting
$\lambda_-$ and $\lambda_+$ be the (increasingly ordered) eigenvalues of T,
the sides of this rectangle have length
$\varepsilon(1 + \frac{2}{\lambda_+^k - 1})$ and
$\varepsilon(1 + \frac{2}{\lambda_-^{-k} - 1})$. As it turns out, for the
Arnol'd cat and other area-preserving maps, these quantities are equal (since
$\lambda_+ \lambda_- = 1$) and the rectangle of initial conditions just
described is a square. Figure 12 draws these squares at a fixed value of
$\varepsilon$.

We can then repeat a two-dimensional generalization of the arguments of 
the previous section. This we will do elsewhere, but we will provide the result 
below. In fact, an even simpler argument can be sketched. We have seen above
that a square of area $\varepsilon^2 (1 + \frac{2}{\lambda_+^k - 1})^2$ exists
at each fixed point of $T^k$ and is characterized by return times k, or less.
This area quickly becomes $\varepsilon^2$ to a good approximation. Moreover,
the number of periodic points of $T^k$ grows like $\lambda_+^k$. Then, when
$\varepsilon$ is "small" with respect to k, neglecting all other
considerations, we can write

\[ p(\varepsilon, k) \simeq \varepsilon^2 \lambda_+^k. \tag{62} \]

Of course, we have to make precise what we mean by "small". This is when
$\varepsilon^2 \lambda_+^k < 1$. Equality holds for
$k_\varepsilon = -\frac{2 \log \varepsilon}{\log \lambda_+}$. Indeed, following
the results reported in Sect. El, $\frac{2}{\log(\lambda_+)}$ is the almost
sure limit of $\frac{\tau(B_\varepsilon(x))}{-\log \varepsilon}$, since it
coincides with $\frac{1}{\log(\lambda_+)} - \frac{1}{\log(\lambda_-)}$, see
eq. (10). Moreover, in this case $D_1(\mu) = 2$, the invariant measure being
the Lebesgue measure, and the entropy $h_\mu$ is equal to the Lyapunov
exponent $\lambda = \log(\lambda_+)$. Figure 13 draws the distribution
function $p(\varepsilon, k)$ for a different toral automorphism, just chosen
to increase variety: that associated with the matrix (1, 2; 2, 5). The
logarithmically flat approximation in eq. (62) fits the data almost perfectly
in the region $\varepsilon^2 \lambda_+^k < 1$.
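The ingredients of eq. (62) can be illustrated for the matrix (1, 2; 2, 5). The sketch below relies on the standard fact that a hyperbolic toral automorphism T has exactly $|\det(T^k - I)|$ fixed points of $T^k$, a count that grows like $\lambda_+^k$; the text itself only states the growth rate. The function names are hypothetical, and the crossover time is taken as the value of k where $\varepsilon^2 \lambda_+^k = 1$.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return ((a[0][0] * b[0][0] + a[0][1] * b[1][0],
             a[0][0] * b[0][1] + a[0][1] * b[1][1]),
            (a[1][0] * b[0][0] + a[1][1] * b[1][0],
             a[1][0] * b[0][1] + a[1][1] * b[1][1]))

def count_fixed_points(t, k):
    """Number of fixed points of T^k on the torus: |det(T^k - I)|."""
    p = ((1, 0), (0, 1))
    for _ in range(k):
        p = mat_mul(p, t)
    return abs((p[0][0] - 1) * (p[1][1] - 1) - p[0][1] * p[1][0])

def largest_eigenvalue(t):
    """Largest eigenvalue of a 2x2 matrix with real eigenvalues."""
    tr = t[0][0] + t[1][1]
    det = t[0][0] * t[1][1] - t[0][1] * t[1][0]
    return (tr + math.sqrt(tr * tr - 4 * det)) / 2

def crossover_time(eps, lam_plus):
    """Value of k at which eps^2 * lam_plus^k equals one."""
    return -2 * math.log(eps) / math.log(lam_plus)

def flat_approximation(eps, k, lam_plus):
    """Right-hand side of eq. (62)."""
    return eps ** 2 * lam_plus ** k

T = ((1, 2), (2, 5))
lam_plus = largest_eigenvalue(T)
for k in range(1, 6):
    print(k, count_fixed_points(T, k), round(lam_plus ** k, 2))
print("crossover for eps = 1e-3:", crossover_time(1e-3, lam_plus))
```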

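If we now turn our consideration to the line $k = \delta k_\varepsilon$ in the
$(k, \log \varepsilon)$ plane, with $0 < \delta < 1$, we can prove that the
quantity $p(\varepsilon, \delta k_\varepsilon)$ verifies in this
two-dimensional case the analogue of eq. (26):

\[ \lim_{\varepsilon \to 0} \frac{1}{\log \varepsilon} \log p(\varepsilon, \delta k_\varepsilon) = \eta\, (1-\delta)\, H_{\frac{1}{\delta}-1}, \qquad \eta = \frac{2}{\log(\lambda_+)}. \tag{63} \]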

The detailed proof, obtained along the lines of Sect. 5, will be reported elsewhere.
We simply compute here the two sides of the equality (63), showing that they
are equal. From eq. (62), we can compute the l.h.s., obtaining

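\[ \lim_{\varepsilon \to 0} \frac{1}{\log \varepsilon} \log p\Big(\varepsilon, -\frac{2 \delta \log \varepsilon}{\log \lambda_+}\Big) = 2(1-\delta). \tag{64} \]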

On the other hand, the Renyi entropies for the Lebesgue measure and the
Arnol'd cat dynamics are all equal to $\lambda = \log(\lambda_+)$, so that also
the r.h.s. of eq. (63) is equal to $2(1 - \delta)$.

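We conclude this section by taking inspiration from eq. (63) to put forward
a conjecture. We believe that, letting $\eta$ be the almost sure limit of
$\frac{\tau(B_\varepsilon(x))}{-\log \varepsilon}$, see eq. (10), and letting
$k_\varepsilon = -\eta \log \varepsilon$ as before, under sufficient hypotheses
of mixing, one has the following asymptotic behavior:

\[ \lim_{\varepsilon \to 0} \frac{1}{\log \varepsilon} \log p(\varepsilon, \delta k_\varepsilon) = \eta\, (1-\delta)\, H_{\frac{1}{\delta}-1}. \tag{65} \]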

Quite evidently, further investigation is required to confirm this conjecture. We 
can now turn to conclusions. 




Figure 12: Initial conditions x in the two-dimensional torus color coded
according to the return time of the ball $B_\varepsilon(x)$ of radius
$\varepsilon = 0.15$ for the Arnol'd cat map. ($\tau = 1$ red, 2 green, 3 blue,
4 magenta, 5 light blue)



7 Conclusions 

We have studied in this paper various aspects of the statistics of return times
of sets in dynamical systems.

Figure 13: Distribution function $p(\varepsilon, k)$ for the toral automorphism
associated with the matrix (1, 2; 2, 5).

We have first reviewed known results for symbolic $\psi$-mixing systems, that
link return times and Renyi entropies. We have established new "counting
rules", embodied in the sets $W_{n,k}$ and in the lemma of eq. (25), that have
permitted us to explain the slow convergence of the quantities studied in
previous works. At the same time, these results have led to the definition of
a new "partition function" $Z_p(\delta, n)$, eq. (33), that best achieves the
goal of extracting Renyi entropies from return times statistics.

When considering return times for balls, we have established a general
relation holding for one-dimensional expanding maps, eq. (51), that links the
asymptotics of return times with Renyi entropies and the Lyapunov exponent.
This relation has been obtained by developing two different approaches. The
first is a quantitative comparison between balls and dynamical cylinders
especially developed for this case. The second is a more general argument that
describes well the full behavior of the statistics $p(\varepsilon, k)$,
summarized in eq. (55).

We have finally considered linear automorphisms of the two-dimensional
torus, like the Arnol'd cat map, for which a "quick and dirty" analysis is capable
of describing the correct behavior of the distribution function
$p(\varepsilon, k)$. This has permitted us to write the formula in eq. (63)
that links return times statistics and Renyi entropies with $\eta$, the
almost-sure value of the limit in eq. (10). This formula is an extension of
that obtained for one-dimensional systems and we have conjectured that it
should hold in much more general situations than the one presented in this
paper.

As stated in the Introduction, the character of this paper is tailored to the
audience expected for this volume, that comprises both specialists in dynamical
systems and in other disciplines. We have therefore tried to present our results
in the most transparent form, while renouncing at times full rigor in favor of
clarity. We are nonetheless convinced that most of the theory developed here
touches upon new ideas and approaches, and presents more than valuable hints
to where a rigorous treatment will be developed, as we plan to do in forthcoming
publications.